Articulating Hierarchical Design for SOCs

By Jake Buurma
Posted  04/27/01, 04:15:04 PM EDT

Designers of systems on a chip (SOCs) use many design methodologies, flows, and tools to achieve timing closure. Current physical synthesis tools attack the problem of block-level timing convergence from the register-transfer level (RTL) to GDSII. However, the new challenge for integrated circuit (IC) designers is at the chip level, including support for multiple supply voltages, hierarchical integration of multiple soft blocks, hierarchical signal and design integrity, and delay prediction problems. Because many functional blocks and cores are now integrated on a single chip, problems once confined to the board level now appear at the chip level. Next-generation tools must deal with these chip-level problems to ensure rapid design closure with a minimum of iterations.

The need for hierarchical design

While flat design is a highly effective way to implement a design, it becomes virtually impossible to use when the design grows beyond 5 million gates. For designs of this size and larger, in particular where multiple teams are working on a large design, a hierarchical block-based design approach should be taken.

Synthesis tools with built-in placement algorithms provide for more accurate timing predictions of the placed-and-routed design, requiring less iteration between the front- and back-end phases of the design flow. Of course, it's easy to predict timing by placing all of the blocks in the design right next to each other, but this leaves no space for global routes. So the placement algorithm needs to be aware of global routing. Because the choice of metal layer can have a huge effect on timing, the global router also needs to know about detailed routing. Thin wires on inner layers couple with each other and the substrate and, due to this increased wire coupling, the detailed router needs to know about parasitic extraction and analysis. All of this means that the front-end and back-end are no longer separate phases of the design process. It also means that next-generation tools must be tightly integrated to enable iteration and communication between all phases of design.

Designers typically synthesize blocks in isolation from other designers, and then pass the synthesized blocks to a physical design team for assembly. But chip assembly places additional constraints on the design, so blocks invariably need to be re-synthesized. An integrated, hierarchical synthesis/place-and-route methodology allows front-end and back-end designers to share data much earlier in the design process, resulting in a greater chance of first-pass or few-pass timing success.

Giving the front-end team information about global restrictions (including global wire lengths, global buffers, power routing, impact on cell placement, and intra-block routing) reduces the chance of further iterations. Additionally, a hierarchical methodology allows the designers to perform power planning and routing, global buffering, and detailed routing on the top-level design using information drawn from the RTL description.

Shell/core partitioning is a hierarchical timing analysis methodology (see Figure 1). It creates an accurate snapshot of inter-block timing by separating interface (shell) logic from internal (core) logic. The first step in shell/core partitioning involves identifying the shell and core logic of the physical partitions. Shell logic is the part of the design that exists between the physical partition boundary and the first input register. On the output side, logic between the last register and the physical partition boundary is also shell logic. The part of the design between the input and output registers is core logic. If there are no registers in the block, or if the logic does not pass through a register, the entire block is marked as shell logic.
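The identification step described above can be sketched as a graph walk over one block's netlist. This is only an illustration under stated assumptions, not the algorithm of any particular tool: the netlist is modeled as a plain dictionary of fanout connections, the function name `partition_shell_core` and the cell names are hypothetical, and registers are treated purely as stopping points (the article does not say how the boundary registers themselves are binned, so they are left out of both sets).

```python
from collections import deque

def partition_shell_core(fanout, registers, input_ports, output_ports):
    """Split the combinational cells of one block into shell and core sets.

    fanout: dict mapping each node (port or cell) to the nodes it drives.
    registers: set of cell names that are registers (assumed known).
    """
    # Build the reverse graph for the backward walk from the outputs.
    fanin = {}
    for src, dsts in fanout.items():
        for dst in dsts:
            fanin.setdefault(dst, []).append(src)

    def walk(starts, edges):
        # Breadth-first walk that stops at, and excludes, registers.
        seen = set()
        queue = deque(starts)
        while queue:
            node = queue.popleft()
            for nxt in edges.get(node, []):
                if nxt in registers or nxt in seen:
                    continue
                seen.add(nxt)
                queue.append(nxt)
        return seen

    ports = set(input_ports) | set(output_ports)
    # Shell: logic between an input port and the first register,
    # plus logic between the last register and an output port.
    shell = (walk(input_ports, fanout) | walk(output_ports, fanin)) - ports
    all_nodes = set(fanout) | set(fanin)
    core = all_nodes - shell - ports - set(registers)
    return shell, core

# Hypothetical block: in_a -> g1 -> r_in -> g2 -> g3 -> r_out -> g4 -> out_y
fanout = {
    "in_a": ["g1"], "g1": ["r_in"], "r_in": ["g2"], "g2": ["g3"],
    "g3": ["r_out"], "r_out": ["g4"], "g4": ["out_y"],
}
shell, core = partition_shell_core(fanout, {"r_in", "r_out"}, ["in_a"], ["out_y"])
print(sorted(shell))  # ['g1', 'g4']
print(sorted(core))   # ['g2', 'g3']
```

In a block with no registers the walk never stops, so every cell on a path from an input or to an output ends up in the shell set, which matches the rule stated above.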

This shell/core separation provides significant capacity improvements in chip-level timing analysis and budgeting. Timing analysis reads only the netlist for the shell modules, leaving the core module instances as black boxes. In typical designs, the average large physical block contains 20 percent interface logic and 80 percent internal logic. Because only about one-fifth of each block's logic has to be loaded, timing analysis can provide a 5x increase in capacity with a corresponding increase in run-time speed.
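The 5x figure follows from simple arithmetic, sketched below under two assumptions that are not in the article: analysis capacity scales linearly with the amount of logic loaded, and top-level glue logic is ignored. The function name `capacity_gain` and the block sizes are hypothetical.

```python
def capacity_gain(blocks):
    """Ratio of total block logic to the shell logic that analysis loads.

    blocks: list of (gate_count, shell_percent) pairs, one per physical block.
    """
    total = sum(gates for gates, _ in blocks)
    loaded = sum(gates * pct / 100 for gates, pct in blocks)
    if loaded <= 0:
        raise ValueError("at least one block must contain shell logic")
    return total / loaded

# Four hypothetical 1-million-gate blocks, each with 20 percent shell logic.
print(capacity_gain([(1_000_000, 20)] * 4))  # 5.0
```

Blocks with a larger share of interface logic pull the ratio down, so the gain for a real chip depends on its mix of blocks.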

Dynamic timing abstraction is also used in hierarchical design and typically gives a 2x performance increase to chip-level timing analysis, and improves capacity. By focusing only on chip-level paths, dynamic timing abstraction automatically avoids the unnecessary overhead of analyzing any paths contained entirely within blocks. Designers specify the blocks in the design that should be abstracted, then run timing analysis.
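The filtering idea can be illustrated with a small sketch. It assumes timing paths have already been enumerated and tagged with the block that contains each endpoint; a production tool would more likely prune its timing graph than filter a path list. `TimingPath`, `chip_level_paths`, and the block names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TimingPath:
    name: str
    start_block: str  # block containing the launching register or port
    end_block: str    # block containing the capturing register or port

def chip_level_paths(paths, abstracted_blocks):
    """Drop paths that start and end inside the same abstracted block."""
    return [
        p for p in paths
        if not (p.start_block == p.end_block and p.start_block in abstracted_blocks)
    ]

paths = [
    TimingPath("p1", "cpu", "cpu"),  # internal to cpu: skipped once cpu is abstracted
    TimingPath("p2", "cpu", "dma"),  # crosses a block boundary: kept
    TimingPath("p3", "dma", "dma"),  # internal, but dma is not abstracted: kept
]
kept = chip_level_paths(paths, abstracted_blocks={"cpu"})
print([p.name for p in kept])  # ['p2', 'p3']
```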

Floorplanning and signal integrity

RTL floorplanning is based on an estimation of the design size; any inaccuracies in this estimation will invalidate a floorplan and most of the design work needed to create it. In contrast, dynamic floorplanning ensures that top-down constraints converge with the module-up implementation, by enabling the concurrent design of soft blocks (or P&R units) within the context of how they will be used. The use of dynamic floorplanning prevents estimation inaccuracies by taking physical effects into account. Physical synthesis technology forms the basis for an in-context estimation of physical constraints and improves the accuracy of size estimations.

As with timing closure, signal integrity problems have increased as feature size has decreased. For example, at 0.25 micron, only a few thousand nets may require special consideration for parasitic effects, and these nets could be dealt with manually. With 0.18-micron processes, this number jumps to the hundreds of thousands. At 0.13 micron, the majority of nets have problems, and reduced supply voltages compound the potential for crosstalk and substrate coupling. A methodology centered on analysis and correction breaks down at this point.

The solution is a correct-by-construction approach that supports both flat and hierarchical signal and design integrity. Signal integrity effects include crosstalk noise, crosstalk effect on delay, and IR drop. Major design integrity effects include electromigration, wire self-heat (sometimes called signal line electromigration), and hot-electron effects. Next-generation methodologies must incorporate heuristic algorithms that avoid introducing these effects into the design during the P&R process. A further independent analysis and repair step validates the earlier prevention steps and fixes any remaining problems.

To minimize power dissipation, designers typically juggle several different supply rails, reserving the higher voltages for the fastest sections of the design, or specific voltages for analog/mixed-signal sections. The already difficult problem of power distribution intensifies.

Minimizing power dissipation involves interactive and automatic assignment of power and ground connections in the physical design. But power connections are not specified in the logical netlist and voltage domains don't necessarily coincide with hierarchical block boundaries. To remedy this, automatic tools use several different ways to determine supply connections for newly added gates. Supply connections can be derived from connections to upstream and downstream gates, or can be inferred from area. Correspondence between net and pin names can help establish the supply rail connection. Physical isolation of gates can also imply connection to a common supply. During floorplanning, visualization of the supply domains assists the place-and-route process, as do reports on utilization and power consumption per domain.
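As a rough illustration of the first two heuristics (neighboring gates, then area), the sketch below picks a rail for a newly inserted gate. The majority rule, the fallback order, the function name `infer_supply`, and the rail names are all assumptions made for the example; the article does not say how real tools rank these sources.

```python
from collections import Counter

def infer_supply(neighbor_supplies, region_supply=None):
    """Pick a supply rail for a newly inserted gate.

    neighbor_supplies: rails of the upstream and downstream gates
                       (None where a neighbor's rail is unknown).
    region_supply: rail of the floorplan area holding the gate, if known.
    Returns None when nothing can be inferred and a designer must decide.
    """
    rails = [r for r in neighbor_supplies if r is not None]
    if rails:
        (rail, top), *rest = Counter(rails).most_common()
        # Accept the neighbors' rail only when there is a clear majority.
        if not rest or rest[0][1] < top:
            return rail
    # Neighbors are missing or tied: fall back to the area-based guess.
    return region_supply

print(infer_supply(["VDD_1V8", "VDD_1V8"]))                           # VDD_1V8
print(infer_supply(["VDD_1V8", "VDD_1V2"], region_supply="VDD_1V2"))  # VDD_1V2
print(infer_supply([]))                                               # None
```

Returning `None` rather than guessing keeps the interactive part of the assignment visible: the cases the heuristics cannot settle are the ones left for the designer.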

Additionally, timing characteristics may vary throughout the design according to the supply voltage used in each block. This means that tools must support concurrent timing analysis and optimization of multiple voltage domains. In addition, outputs to physical verification tools must communicate information on multiple supply voltages.

An important technology for SOC design is a single, unified database that supports hierarchical data from RTL descriptions through to the placed-and-routed design. Maintaining the design hierarchy also shortens the design cycle. In addition to the hierarchical place-and-route flow described here, the unified database and the design tools integrated with it must support existing flat and hierarchical use. To increase predictability through the design flow, the database must also support uniform timing, power analysis, and signal integrity engines; a full-chip, incremental engine is necessary for synthesis. With the addition to the database of a public-domain application programming interface (API), designers can engineer applications that tightly integrate with the design flow (see Figure 2).

Beyond traditional

Traditional design flows often include a range of disparate tools procured from multiple vendors or created internally; maintaining these tools can cause major headaches for design teams. Integrating these tools from the front-end flow through the back-end processes streamlines time-consuming tasks such as importing, exporting, and translating data.

It's also an advantage if the entire design tool suite is brought together under a common user interface. This gives everyone on the design team a consistent environment with access to all design data throughout the flow. Use of an industry-standard command language, such as Tcl, also helps.

As design size escalates, management of the physical and logical design domains becomes more difficult. A hierarchical, block-based design flow assists with design management and helps enable concurrent design at the top level and at the module level. An integrated, hierarchical synthesis and P&R methodology avoids circular dependencies between top-level and module-level design and helps to complete timing convergence at the chip level. Support for a concurrent design methodology enables a parallel hand-off between ASIC designer and ASIC vendor for quick timing closure.


Gerald (Jake) Buurma is senior vice president of worldwide research and development at Cadence Design Systems, Inc. (San Jose, CA).

He has 29 years of industry experience in IC design and EDA.


All material on this site Copyright © 2001 CMP Media Inc. All rights reserved.